Abstract

Piwi-interacting RNAs (piRNAs) and CRISPR RNAs (crRNAs) are two recently discovered classes of small noncod- 
ing RNA that are found in animals and prokaryotes, respectively. Both of these novel RNA species function as com- 
ponents of adaptive immune systems that protect their hosts from foreign nucleic acids — piRNAs repress 
transposable elements in animal germlines, whereas crRNAs protect their bacterial hosts from phage and plasmids. 
The piRNA and CRISPR systems are not homologous but rather have independently evolved into logically similar
defense mechanisms based on the specificity of targeting via nucleic acid base complementarity. Here we review 
what is known about the piRNA and CRISPR systems with a focus on comparing their evolutionary properties. In
particular, we highlight the importance of several factors on the pattern of piRNA and CRISPR evolution, including 
the population genetic environment, the role of alternate defense systems and the mechanisms of acquisition of 
new piRNAs and CRISPRs.

INTRODUCTION

In the last decade, a myriad of novel noncoding 
RNA species have been discovered, many of 
them very small in size (~20-40 nt) [1]. Molecular
pathways that involve small noncoding RNAs 
binding to an Argonaute protein are often referred 
to as RNAi-related pathways. There are several
classes of Argonaute proteins, most famously the 
Ago-class Argonautes that bind to microRNAs and 
that are firmly established as fundamentally important 
regulators of gene expression in many areas of 
animal, plant and viral biology [2]. In this review, 
we will focus on small RNAs that bind to Piwi-class 
Argonaute proteins, called Piwi-interacting RNAs
(piRNAs). Piwi proteins are found only in animals,
in which they are generally highly expressed
in the germline. Consistent with this, the best understood
function of piRNAs is their role in defense against 
transposable elements in the germline. 

There are also several classes of small noncoding 
RNAs that do not participate in RNAi-related
pathways, and in this review, we will also discuss
the CRISPR (clustered regularly interspaced short 
palindromic repeats) RNAs (crRNAs) that are 
found only in prokaryotes. Although there exists 
an archaeal Argonaute homolog [3], crRNAs do 
not bind to Argonaute proteins. Indeed, it has 
been suggested that the archaeal Argonaute may 
have a role in DNA rather than RNA modification 
[4] and that it may be involved in a completely 
separate prokaryotic defense mechanism from the 
CRISPR system [5]. Instead, the crRNAs bind to 
a different protein called Cas. Together, the prokaryotic
CRISPR-Cas system functions as an adaptive
defense mechanism against phage and plasmids. 

Despite their lack of homology, there is a very 
clear logical similarity between the piRNA and 
CRISPR systems. In both systems, sequences from 
the invading nucleic acid are incorporated into 
specific loci in the host genome. When these 
sequences are transcribed and processed into small 
RNAs, the small RNAs can then guide repressive
molecular complexes to invading nucleic acids in
trans. Both the piRNA and CRISPR systems are 
thus clear examples of Lamarckian mechanisms in 
which environmental factors directly cause heritable 
genetic changes [6]. The similarity between piRNAs
and CRISPRs has been observed many times — 
indeed CRISPRs were at one time speculated to 
be an RNAi-related system [7] — and there are 
already a number of recent reviews of piRNA 
[8, 9] and CRISPR biology [10-12]. However, 
there has been much less consideration of the 
evolutionary properties of these two fascinating mo- 
lecular systems. Accordingly, our goal in this review 
is to survey the literature on piRNAs and CRISPRs 
with an emphasis on aspects of their biology most 
relevant to their evolution and to highlight factors 
which may cause the two systems to behave differ- 
ently from an evolutionary standpoint. 



OVERVIEW OF THE PIRNA SYSTEM 

Although hints of the piRNA system were observed 
in Drosophila as early as 2001 [13], piRNAs were 
definitively discovered by several groups independ- 
ently in 2006 by immunoprecipitating Piwi protein 
from mammalian testis and sequencing the bound 
small RNAs [14-17]. PiRNAs are ~26-30 nt in
mammals although their lengths can be slightly 
different in other animals. PiRNAs have essentially 
no known defining sequence characteristics beyond a 
very strong propensity for a 5'-uridine and a weaker
bias toward an adenosine at position 10. PiRNAs are 
in general difficult to predict bioinformatically and 
must instead be defined biochemically. However, 
protocols for immunoprecipitating Piwi protein are 
still an active area of research [18] and there are no 
definitive sets of piRNA genes yet because the popu- 
lation of piRNAs is typically very large (in the hun- 
dreds of thousands) and complex. Caenorhabditis 
elegans piRNAs may be significantly different from 
mammalian and Drosophila piRNAs because they 
have a different length (21 nt), and there appears to 
be a conserved promoter motif upstream of many 
piRNAs [19], suggesting that each piRNA is a sep- 
arate transcription unit, unlike piRNAs in mammals 
and Drosophila which are typically expressed in long 
polycistronic transcripts. 
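
The 5'-uridine and position-10 adenosine biases mentioned above can be measured directly on a set of sequenced reads. The following is a minimal sketch, assuming reads are given as plain strings; the function name `end_biases` and the toy reads are hypothetical and are not taken from any of the cited studies.

```python
def end_biases(reads):
    """Return (fraction with a 5' U, fraction with an A at position 10)."""
    # Treat DNA-style reads (T) as RNA (U); skip reads shorter than 10 nt.
    seqs = [r.upper().replace("T", "U") for r in reads if len(r) >= 10]
    if not seqs:
        return 0.0, 0.0
    first_u = sum(s[0] == "U" for s in seqs) / len(seqs)
    tenth_a = sum(s[9] == "A" for s in seqs) / len(seqs)
    return first_u, tenth_a


# Toy reads, invented for illustration only (26 nt each).
toy_reads = [
    "U" + "G" * 8 + "A" + "C" * 16,  # 5' U and A at position 10
    "U" + "G" * 25,                  # 5' U only
    "T" + "C" * 8 + "A" + "G" * 16,  # DNA-style read, counted as 5' U and 10A
    "C" + "G" * 25,                  # neither feature
]
print(end_biases(toy_reads))
```

Such a summary only describes a population of reads; as noted above, it is not sufficient to predict individual piRNAs bioinformatically.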

Unlike other small RNAs from the RNAi-related 
pathway, such as microRNAs and small interfering 
RNAs, which are produced from double-stranded 
intermediates by the Dicer enzyme, piRNAs are
thought to be produced from long polycistronic
RNA transcripts by a Dicer-independent mechanism 
in mammals and Drosophila. Note that unlike the 
CRISPR system described below, piRNA popula- 
tions are very complex and piRNAs appear to be 
produced by quasi-random cleavage of the primary 
piRNA transcript [20]. That is, while piRNAs 
almost always start with a U and there are biases 
for particular sequences to be cleaved as piRNAs, 
there is a strong random component that determines 
which sequences of the primary transcript are 
processed into piRNAs (hence the term 'quasi- 
random'). PiRNA 3'-end formation is poorly under- 
stood and is an object of active research [21]. 
However, piRNA 5'-end formation was addressed 
by several key papers [22, 23]. The authors studied 
master loci that control transposable element prolif- 
eration in Drosophila but were molecularly uncharac- 
terized for many years because of the apparent lack of 
functional sequences at the loci, other than a jumble 
of transposable element insertions. These master loci 
were found to produce piRNAs that repress trans- 
posable elements in trans [22] (Figure 1). The authors 
proposed the Ping-Pong mechanism [22, 23] in 
which primary piRNAs cleave sense transposon 
transcripts and simultaneously produce secondary 
piRNAs from the sense transposons that then 
cleave antisense transposon transcripts. This mechan- 
ism thus depends on the transcription of both sense 
and antisense transposon transcripts. An alternative
view is that piRNAs are in fact produced through 
a double-stranded intermediate [24] based on the 
recent reports of the existence of an RNA- 
dependent RNA polymerase in Drosophila [25]. 
However, the existence of a Drosophila RDRP 
remains controversial, and this view remains a 
minority interpretation at the present time. 

Since the Ping-Pong mechanism is a positive 
feedback loop, one question is how the Ping-Pong 
mechanism is started in the first place. In Drosophila, 
a partial answer is provided by the fact that piRNAs 
are deposited maternally into the embryo [26, 27]. 
PiRNAs can thus be inherited epigenetically across 
generations. A second answer comes from evidence 
in Drosophila that primary piRNAs are produced
in the somatic follicle cells and delivered to the 
germline to start the Ping-Pong cycle [28, 29]. 
A similar mechanism is found in Arabidopsis for a 
different class of small RNAs [30], suggesting that 
this may be a universal mechanism where transposons
are activated outside of the germline to
generate small RNAs, thus reducing the chance of
deleterious transposon insertions in the germline. A
third possibility is suggested by a related system of
RNAi and heterochromatin formation in fission
yeast, in which degradation products from random
abundant transcripts are used to prime Argonaute
proteins and start a positive feedback loop [31].

Figure 1: PiRNAs expressed from discrete loci in the Drosophila genome (X-TAS and Flamenco) repress transposable
elements in trans (gypsy, P-element, Idefix, ZAM). Reprinted from 'Mighty Piwis Defend Germline against
Genome Intruders', K. A. O'Donnell and J.D. Boeke, Cell 2007; 129(1):37-44, with permission from Elsevier.

Aside from the piRNAs that are derived from 
repetitive elements and involved in the Ping-Pong 
mechanism, there are classes of piRNAs that are not 
repetitive. For example, the piRNA populations ex- 
pressed at different stages of mammalian testis devel- 
opment are distinct and those found at the pachytene 
stage are depleted in repetitive sequences [32]. In 
addition, some piRNAs are found in genes and are 
assumed to repress their host transcripts [33]. Finally,
there is some evidence that piRNAs are functional in
the rat brain [34]. The connection between neural
expression of piRNAs and the expression of trans- 
posable elements in the mammalian brain [35] has 
been observed and is clearly intriguing, but there is 
currently no evidence to further connect these two 
aspects of neuroscience. In the rest of this review, we 
will focus on the repetitive piRNAs that are involved 
in the Ping-Pong mechanism and repress transpos- 
able elements because they are much better under- 
stood than the nonrepetitive piRNAs. 

Overview of piRNA evolution 

The piRNA system is known to be ancient, as Piwi
proteins and the Ping-Pong signature are conserved
in basal metazoans [36]. However, no Piwi homologs
have been found outside animals, so the piRNA
system appears to be an animal-specific innovation. 
Between closely related species, the genomic loca- 
tions of many piRNA clusters are conserved, but 
the sequences of the piRNAs themselves are not 
conserved between rat and mouse [37], C. elegans 
and C. briggsae [19] or Drosophila melanogaster and 
D. simulans [38]. Thus, the overall picture of 
piRNA evolution at the sequence level is one of 
very rapid evolution. 

A recent study of human piRNAs by one of the 
authors suggested that there is strong negative selec- 
tion at the sequence level for human piRNAs but 
only in the three African populations and not any of 
the eight non-African populations studied [39]. This
observation is consistent with a recent report that 
African populations have much higher rates of trans- 
poson insertion than other populations [40]. A fur- 
ther intriguing observation from the analysis of 
human piRNAs and transposable elements is the 
depletion of piRNA matches in the reverse tran- 
scriptase region of human LINE-1 elements, 
though not mouse LINE-1 elements [39]. This 
observation suggests the possibility that at least one 
reverse transcriptase might be functional for the host 
and therefore protected from piRNA-mediated 
repression. 

Beyond sequence divergence, it is also interesting 
to study the relationship of piRNA clusters and copy 
number changes, as an increase in copy number
could potentially increase the level of gene
expression of piRNAs. Assis and Kondrashov studied 
the evolution of piRNA clusters between mouse and 
rat and found a very high rate of piRNA cluster 
duplication, which they suggested is indicative of 
positive selection for higher expression level of 
piRNAs [37]. 

Although the piRNA system is not understood 
well enough for detailed mathematical modeling, 
there has been one attempt by Lu and Clark [41] 
at modeling piRNA-transposable element co-evolu- 
tion using computer simulations. From their simula- 
tion, they suggested that retrotransposon insertions 
that are repressed by piRNAs can reach high fre- 
quencies or even be fixed in the population because 
their deleterious effect is attenuated by piRNA 
repression. 

The idea that the piRNA pathway and transpos- 
able elements might co-evolve in a Red Queen-like 
scenario has been explored by a number of authors. 
In this scenario, alternating rounds of adaptation and 
counter-adaptation would lead to increased rates of 
positive selection. In a molecular evolution analysis 
examining species across the Drosophila genus, it was 
found that a higher transposable element abundance 
is positively correlated with greater codon bias in 
piRNA pathway genes but not an increased rate of 
amino acid substitution in these genes [42]. The 
authors suggested that these observations indicate 
that positive selection on piRNA pathway genes 
occurs mainly at the level of translation efficiency 
mediated by codon usage (although other explan- 
ations for codon bias are possible) as opposed to 
amino acid substitution [42]. Further, a resequencing
study of a number of defense genes in D. melanogaster 
and D. simulans concluded that RNAi genes have 
the highest rate of adaptive evolution over all 
immune-system genes [43]. Subsequent studies also 
found recurrent adaptation across the twelve 
sequenced Drosophila genomes for a number of 
piRNA pathway genes, including SPN-E, AUB, 
KRIMP, SQU and ZUC [44], as well as Rhino 
[45]. Overall, these studies point to elevated
rates of evolution in piRNA pathway genes,
consistent with the pathway's role in genome defense. While the
molecular details of the Red Queen scenario for 
piRNAs and transposable elements are unclear, cer- 
tain aspects of transposable element evolution, such 
as a higher global transposition rate, could select for 
certain features of piRNA-pathway genes, such as 
stronger binding affinity of the proteins for piRNAs. 



PiRNAs and phenotypic capacitors 

An interesting and somewhat contentious aspect of
the piRNA system's place in evolution is its role
in canalization. Canalization, most famously associated
with Waddington [46], refers to the buffering
of genetic or environmental insults to ensure devel- 
opmental robustness. In a seminal paper, Rutherford 
and Lindquist [47] suggested that Hsp90, a protein 
chaperone, is a phenotypic capacitor in Drosophila, 
meaning that it buffers genetic variation but when 
it is compromised, that variation is revealed in mul- 
tiple mutant phenotypes, at least some of which 
could be adaptive in certain environments [48]. 
Similar results were subsequently demonstrated in 
Arabidopsis [49], suggesting that Hsp90 might play 
an evolutionarily conserved role as a phenotypic 
capacitor. 

The connection between canalization and the 
piRNA system comes from a recent report that in 
Drosophila, Hsp90 regulates the piRNA pathway, 
which in turn regulates the insertion of transposons 
[50]. It was further suggested that Hsp90 interacts in
a protein complex with Piwi protein and mediates 
canalization by epigenetic silencing of genetic vari- 
ation and suppressing transposon insertion [51]. 
Thus, one potential mechanism by which the dis- 
ruption of Hsp90 creates phenotypic variation is not 
by revealing previously cryptic variation as suggested 
by Rutherford and Lindquist but rather through 
de novo mutations generated by transposon insertions. 
For this to be true, a strong bias in the genomic
position of transposon insertions that depends
on genetic background is required, and
while such a preference is known to exist, it is not 
clear if it is strong enough to fully explain the results 
of the Rutherford and Lindquist experiments. Also, 
the piRNA study [51] showed an effect on gene 
regulation separable from the effect on transposons. 
Conversely, imprecise transposon deletions could 
have a mutagenic effect and would necessarily be 
in the same place in the genome, so more work
needs to be done to define the exact role of 
piRNAs in canalization. 

OVERVIEW OF CRISPRS 

CRISPR loci were initially reported simply as 
arrays of DNA repeats in Escherichia coli [52] in 1987 
and subsequently named 'CRISPR' in 2002 when it 
was observed that such arrays were common in prokaryotes
[53]. In 2005, several groups found that
CRISPR spacers often have similarity to foreign
DNAs, especially phage, suggesting a role in cellular 
defense [54, 55]. Finally, molecular experiments in 
2007 and 2008 showed that CRISPRs indeed confer 
immunity to phage [56] and plasmids [57] (Figure 2). 

Databases of CRISPRs from sequenced prokary- 
otic genomes have been created [58] and current 
estimates indicate that nearly all archaea and about 
half of all bacteria contain CRISPRs. A prokaryotic 
cell can contain one or more CRISPR cassettes that 
are made up of alternating sequences of repeats and 
spacers. The spacers encode the functional RNA 
units that are often homologous to phage or plasmid 
sequences and can direct cleavage of those molecules 
in trans. The repeats may be recognized by a protein 
that processes the long RNA transcript into individ- 
ual spacer units. The number of CRISPR loci per 
genome ranges from 1 to 15, and each varies in 
length from several to a few hundred spacers, up to 
a longest known locus containing 587 spacers. The 
repeat and spacer sizes are typically 21-48 bp for repeats
and 26-72 bp for spacers. In general, the
CRISPR repeats are not conserved at the sequence 
level beyond a few short conserved sequences such as 
GTTTg/c at the 5'-end and GAAAC at the 3'-end. 
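
The alternating repeat-spacer layout described above lends itself to a simple illustration. The sketch below recovers spacers from an array sequence by splitting on the repeat. It assumes the repeat is known and matches exactly, which is a simplification since real repeats can vary within an array; the function name `extract_spacers` and all sequences are hypothetical.

```python
def extract_spacers(array_seq, repeat):
    """Split a CRISPR array on exact copies of the repeat and return the spacers."""
    parts = array_seq.split(repeat)
    # The first and last pieces are the flanking sequence (leader and trailer),
    # so only the interior pieces are spacers.
    return [p for p in parts[1:-1] if p]


# Invented 21-bp repeat carrying the GTTTG ... GAAAC end motifs mentioned above.
REPEAT = "GTTTG" + "CCCCGCAGGCG" + "GAAAC"
spacer_1 = "ACGT" * 6 + "AC"   # 26 bp, invented
spacer_2 = "TTGACC" * 5        # 30 bp, invented
toy_array = "ATATAT" + REPEAT + spacer_1 + REPEAT + spacer_2 + REPEAT + "CCGG"

for spacer in extract_spacers(toy_array, REPEAT):
    print(len(spacer), spacer)
```

In practice, the dedicated CRISPR databases cited above should be preferred over exact string matching of this kind.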

The mechanism of CRISPR-mediated defense 
seems to depend on the particular prokaryotic 
species, and crRNAs can direct cleavage of either 
DNA [59] or RNA [60]. However, the distinction 
between DNA and RNA targeting will not be 
important for the evolutionary perspective we 
adopt here. CrRNAs bind to Cas (CRISPR-associated)
proteins, which are typically encoded in
the genome close to the CRISPR array. Several of 
the Cas proteins together form the Cascade complex. 
The classification of CRISPR-Cas systems has been 
revised a number of times, but the most recent study 
classifies them into three major types (Types I-III)
[61]. The details of these different Cas systems are 
beyond the scope of this manuscript, and the inter- 
ested reader is referred instead to other recent 
reviews on the subject [62]. The Cas proteins func- 
tion in three distinct steps — to integrate nucleic acid 
fragments as new spacers, to cleave the precursor 
crRNA and finally to cleave the target. 

Review of CRISPR evolution 

One of the most interesting aspects of CRISPR loci
is their linear organization: new CRISPR spacers
are always inserted at one end of the locus, which
makes them a unique temporal record of past phage
invasions [12]. Note that there are occasionally
deletions of spacers so the linear ordering is only 
approximate. CRISPRs have also been well-studied 
in the context of metagenomics [63] where the 
simple presence of CRISPR cassettes is sufficient to 
link prokaryote species to their phage invaders. 

CRISPRs can be located on plasmids and hori- 
zontally transferred between prokaryotes [64], and 
indeed CRISPRs are believed to have originated 
in thermophilic Archaea before spreading via hori- 
zontal transfer to other prokaryotes [65]. Conversely, 
CRISPRs can prevent horizontal gene transfer by 
repression of plasmids and thus may contribute to 
the formation of independent bacterial lineages, 
similar to other prokaryotic repressive mechanisms 
of plasmids [66]. 

One study that looked at the distribution of 
CRISPR cassettes in 290 strains of E. coli found 
that closely related strains generally had identical 
CRISPRs, whereas distantly related strains had com- 
pletely different CRISPRs, suggesting rare but dra- 
matic change in CRISPR spacer content over 
evolutionary time rather than small gradual changes 
[67]. Metagenomic studies of CRISPRs showed that
there is a history of selective sweeps at CRISPR loci 
[68] and a history of polymorphism at old CRISPR 
spacers [69], consistent with their role in genome 
defense. Conversely, phage are known to escape 
CRISPR targeting by mutation and deletion of 
bases [59, 70] or shuffling of sequences [63]. 
Finally, a recent molecular evolution study of Cas 
gene evolution found patterns of relatively fast evo- 
lutionary change, consistent with a co-evolutionary 
arms-race between CRISPRs and phage [71]. 

Figure 2: (A) Sequences from viruses or plasmids are cleaved into novel spacers and inserted into discrete CRISPR
array loci. (B) The CRISPR array is transcribed into a pre-crRNA which is processed into individual crRNAs. These
small RNAs are bound by proteins from the Cas complex and used to guide the Cas proteins to target invading
nucleic acids. PAM (protospacer adjacent motif) distinguishes self from nonself to prevent autoimmunity. From
'CRISPR/Cas, the Immune System of Bacteria and Archaea', P. Horvath and R. Barrangou, Science 327:5962, 2010.
Reprinted with permission from AAAS.

Mathematical models

The elegance of the CRISPR system has attracted the
attention of a number of modeling groups, who have
attempted to design simple mathematical models of
CRISPR evolution, generally based on simple ordinary
differential equations. In one study by Bruce
Levin, the population dynamics of CRISPRs in bacterial
populations growing in a chemostat were modeled
using standard chemostat models [72]. The
biological significance of this model has not yet
been shown, but one could perform the actual experiment
of growing phage and bacteria together in a
chemostat and tracking the dynamics of their
population growth over time to directly test the
predictions of the model. Among other simple mathematical
models [73-75], Haerter et al. [73] suggested
that phage and bacteria can coexist even when the
phage are much more diverse than the capacity of the
CRISPR system, while He and Deem [74] used their
model to show that the 5'-most spacer is expected to
be the most diverse. A recent study by Childs et al. [75]
described an explicit eco-evolutionary model of
CRISPR evolution that produced many insights,
including that CRISPRs induce host and viral
diversification, punctuated replacement of strains and the
emergence of coalitions of dominant host strains.
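
To give a sense of what a model based on ordinary differential equations looks like in this setting, the following is a minimal sketch of a generic resource-bacteria-phage chemostat integrated with a simple Euler scheme. It is not the model of [72] or of any other cited study, it contains no CRISPR immunity term, and every parameter value and name (`simulate_chemostat`, `w`, `c`, `vmax`, `k`, `e`, `delta`, `beta`) is an assumption chosen purely for illustration.

```python
def simulate_chemostat(hours=200.0, dt=0.001, phage0=0.0):
    """Euler integration of a generic resource (r), bacteria (n), phage (p) chemostat."""
    # All parameter values are illustrative assumptions, not fitted estimates.
    w = 0.2        # dilution (washout) rate, per hour
    c = 100.0      # resource concentration in the inflow
    vmax = 0.7     # maximum bacterial growth rate, per hour
    k = 4.0        # Monod half-saturation constant
    e = 5e-7       # resource consumed per new bacterial cell
    delta = 1e-9   # phage adsorption rate constant
    beta = 50.0    # burst size (phage released per infected cell)

    r, n, p = c, 1e4, phage0
    for _ in range(round(hours / dt)):
        growth = vmax * r / (k + r)              # Monod growth rate
        dr = w * (c - r) - e * growth * n        # inflow/outflow minus uptake
        dn = growth * n - delta * n * p - w * n  # growth minus infection and washout
        dp = beta * delta * n * p - w * p        # bursts minus washout
        # Clamp at zero so that discretization error cannot produce negative densities.
        r = max(r + dt * dr, 0.0)
        n = max(n + dt * dn, 0.0)
        p = max(p + dt * dp, 0.0)
    return r, n, p


print(simulate_chemostat())              # phage-free chemostat
print(simulate_chemostat(phage0=1e3))    # same chemostat seeded with phage
```

In the phage-free case, the bacteria settle where their growth rate equals the washout rate, which provides a convenient check on the integration. A model of CRISPR dynamics would add further terms, for example for spacer acquisition and phage escape, on top of a skeleton of this kind.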

In our own agent-based simulations of bacteria 
and phage evolving with CRISPR, we observe 
two evolutionarily stable modes depending on the
cost of resistance, which we can interpret as a
combination of the energetic cost of expressing the
CRISPR and Cas transcripts and proteins with the 
fitness cost of occasional errors in self versus nonself 
discrimination that result in cleavage of the host pro- 
karyotic genome. In our simulations, we observed 
one mode where the individuals hold a large 
number of CRISPRs and one mode where each 
individual holds a small number of CRISPRs, but 
in either case the bacterial population as a whole has 
a high diversity of CRISPRs, thus conferring group 
resistance to the phage (M. S. Kumar and K. 
C. Chen, unpublished data). These results are quali- 
tatively similar to previous simulation results for the 
restriction-modification system in bacteria [76].

DIFFERENCES BETWEEN PIRNA 
AND CRISPR EVOLUTION 

The piRNA and CRISPR systems are clearly very 
similar in their overall molecular logic, and the pres- 
ence of such an RNA-based mechanism in all three 
kingdoms of life suggests that RNA may be uniquely
well-suited for genome defense against foreign 
nucleic acids. However, there are also significant 
differences between piRNAs and CRISPRs, which 
affect their evolution, as we discuss in this section. 

Significance of the population genetic 
environment 

While both the piRNA and CRISPR pathways are 
expected to show elevated rates of evolution con- 
sistent with the Red Queen dynamics often seen in 
host-pathogen interactions, the rate of evolution
might still be very different between the two sys- 
tems because of their different population genetic 
environments. Three important aspects of the 
population genetic environment that merit consid- 
eration are the effective population size, the gener- 
ation time of the organisms involved and the 
mutation rate. 

From population genetics theory, the effective 
population size for a transposable element family is 
the effective population size of the host species 
multiplied by the average number of active copies 
of the transposable element per haploid genome [77]. 
The second quantity — the average number of ac- 
tive copies of the transposable element — varies by 
species and transposable element family. As a concrete
example, there are estimated to be 80-100
active LINE-1 elements in the human genome. 
In the piRNA-transposable element system, the
transposable elements are embedded in the host
genome and thus are constrained to be replicated 
in the same generation time as the host genome. 
The transposable elements also have roughly the 
same mutation rate as the host genome. Even if 
the transposable elements are biased to certain parts 
of the genome, the differences in the local mutation 
rate are relatively minor. The mutation rate is a 
significant consideration because there can be muta- 
tions in transposable elements that are countered by 
compensatory mutations in piRNAs. This mechanism
of host response is distinct from the
incorporation of new piRNA sequences, which we
discuss below in the section 'Significance of the
insertion mechanism for new CRISPRs/piRNAs'.
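
The relationship stated at the start of this discussion can be written compactly as follows. The symbols are introduced here for illustration only, and the numerical example rests on two assumptions that are not made in the text above: a host effective population size of roughly 10^4, a commonly quoted order of magnitude for humans, and the treatment of the 80-100 active LINE-1 elements as a per-haploid-genome count.

```latex
% Effective population size of a transposable element (TE) family
N_e^{\mathrm{TE}} = N_e^{\mathrm{host}} \times \bar{n},
\qquad \bar{n} = \text{average number of active copies per haploid genome}

% Illustration under the stated assumptions (N_e^{host} \approx 10^{4}, \bar{n} = 80 \text{ to } 100)
N_e^{\mathrm{TE}} \approx 10^{4} \times (80 \text{ to } 100) = 8 \times 10^{5} \text{ to } 10^{6}
```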

In contrast to the piRNA-transposable element 
system, in the CRISPR-phage system, the phage 
are autonomous and typically have a census popula- 
tion size much larger than their prokaryotic hosts. 
Although it is difficult to estimate effective popula- 
tion sizes for bacteria or phage, let alone average 
effective population sizes over all bacteria and all 
phage that engage in the CRISPR system, a reasonable
estimate is that the effective population size of
phage is significantly higher than that of prokaryotes, based
on the large difference in census population sizes. A 
higher effective population size would imply a 
higher effectiveness of natural selection for phage 
compared to prokaryotes. Overall, we expect similar 
phage and prokaryotic generation times since the 
phage lysis time should be correlated with the pro- 
karyotic cell division time. However, in some sys- 
tems, phage have a faster generation time than their 
host bacteria, since the phage can lyse cells and 
reproduce on a time scale faster than a bacterial cell 
division. Furthermore, in each generation, many 
phage can be produced whereas the bacterial popu- 
lation is only doubled. Finally, phage mutation rates 
are often significantly (10-100 times) higher than
those of bacteria, both for phage with DNA genomes and
those with RNA genomes [78]. 

In sum, despite the scarcity of precise measure- 
ments of the relevant population genetic parameters 
and generalizing over a very large phylogenetic 
range, we observe the following broad patterns. 
Both phage and transposable elements have higher 
effective population sizes than their host species. 
Both phage and transposable elements also have generation
times similar to those of their host species, though the
phage generation time can be faster and the rate of
growth higher compared to transposable elements,
which are constrained to have exactly the same generation
time as their host. Thus, the largest difference
between the population genetic environments of the
piRNA and CRISPR systems is the mutation rate,
since phage can mutate, and thereby evade
CRISPR-mediated repression, much faster than
transposable elements can evade piRNA-mediated
repression. These observations suggest that CRISPRs
would be a relatively ineffective defense mechanism 
against phage compared to the efficacy of piRNAs 
against transposable elements. However, CRISPRs 
are backed up by alternative defense mechanisms, 
and the insertion rate of new CRISPR spacers may 
be much higher as discussed in the next two sections. 

Significance of alternative mechanisms 
of genomic defense 

It is important to place the defense mechanisms we 
are discussing in the context of other defense mech- 
anisms in the cell. Indeed, given the intense pressure 
placed on prokaryotes by fast evolving phage, it is 
unsurprising that prokaryotes have evolved multiple 
redundant defense mechanisms. Several such defense 
mechanisms have been discussed in the literature, 
including the restriction-modification (RM) system
[79] and envelope resistance [80]. A restriction- 
modification system consists of a pair of enzymes 
that recognize the same short DNA sequence. The 
restriction enzyme cuts all unmethylated target 
sequences while a methylase acts to methylate all 
of the host target sequences. The RM system thus 
serves as a defense mechanism against invading 
phage. Since restriction enzymes target short 
(roughly 6-8 bp) sites, each enzyme can target
many phage genomes. Each CRISPR spacer, how- 
ever, is in principle constrained to target a specific 
phage because of the requirement for complemen- 
tarity to the entire RNA sequence (though see 
below on the possibility of a CRISPR seed sequence).
On the other hand, it is much faster to evolve a new 
crRNA, which can be produced by a Lamarckian 
mechanism, than a new RM system, which requires 
classical Darwinian evolution by random mutation 
and selection. The RM and CRISPR systems thus 
have different properties that may allow them to 
work well together in genome defense. In fact, 
Abedon suggested that CRISPRs are subsidiary to 
other defense mechanisms [81] using an argument 
similar to the logic of the vertebrate immune re- 
sponse where an innate immune response provides 
the first line of defense and an adaptive immune
system the second line of defense. In the prokaryotic
context, RM systems might play the role of the 
innate immune response and CRISPRs the adaptive 
immune system. Experimental evolution results have 
also indicated that envelope resistance (i.e. a struc- 
tural modification that prevents adsorption of any 
phage into the cell) often develops in response to 
phage in lab cultures [82]. 
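
The contrast drawn above between short restriction sites and full-length spacers can be made concrete with a back-of-the-envelope calculation. The sketch below computes the expected number of occurrences of one specific k-bp sequence in a random genome, assuming equal base frequencies, independent positions and a single strand; these are simplifying assumptions, and the function name `expected_sites` is hypothetical. The 48,502 bp length of the phage lambda genome is used only as a convenient example.

```python
def expected_sites(genome_len, k):
    """Expected occurrences of one specific k-mer in a random genome (single strand)."""
    if k > genome_len:
        return 0.0
    positions = genome_len - k + 1   # number of possible start positions
    return positions / 4 ** k        # each position matches with probability 4^-k


LAMBDA_BP = 48502  # phage lambda genome length, used here only as an example

for k in (6, 8, 30):
    print(k, expected_sites(LAMBDA_BP, k))
```

Under these assumptions, a 6-bp site is expected roughly a dozen times in a genome of this size and an 8-bp site about once, whereas a chance match to a 30-nt spacer is vanishingly unlikely. This is consistent with the argument that a restriction enzyme can target many phage genomes while a spacer is, in principle, specific to the phage from which it was acquired.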

In the case of piRNAs, our population genetic 
arguments above suggest that they might be more 
effective at repressing transposable elements than 
CRISPRs are at repressing phage. Nonetheless, 
other molecular mechanisms also play a significant 
role in the repression of transposable elements in 
the germline. One important mechanism is DNA 
methylation that prevents transcription of transpos- 
able elements in the germline. While little is under- 
stood about the evolutionary properties of DNA 
methylation or how DNA methyl marks are directed 
to specific loci in the genome, intriguingly, piRNAs 
are also implicated in the maintenance of DNA 
methylation in mammals [83, 84]. When this mech- 
anism is more fully worked out at the molecular 
level, it may be possible to start understanding the 
interplay between piRNA-mediated regulation of 
transposable elements at the chromatin level versus 
the RNA level. On a broader scale, RNAi-related 
systems in general are known to be involved in 
genome defense [85] and may have even originated 
for that purpose. 

Significance of the insertion mechanism 
for new CRISPRs/piRNAs 

In the CRISPR system, the CAS proteins provide an 
active mechanism for inserting new phage sequences 
into CRISPR loci. In principle, this should allow 
very fast adaptation to novel phage attacks, in con- 
trast to the piRNA system, as described below. An 
intriguing aspect of the CRISPR system is the linear 
arrangement of CRISPR spacers since the newest 
CRISPR spacers are inserted at the 5'-end of 
the CRISPR cassette. For the evolutionist, this 
arrangement conveniently gives the temporal history 
of phage infections, with the caveat that occasional 
deletions of spacers make the history only approxi- 
mate [12]. For the prokaryotic host, it is still not clear 
if there is any biological significance to this arrange- 
ment. One potential benefit to the host could come 
from RNA polymerase drop off: since the entire 
CRISPR cassette is transcribed as a long transcript 
from which individual spacers are cleaved, even if the 
polymerase falls off the elongating transcript, relevant 
CRISPR spacers that target currently active phage 
will still be expressed. 
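
The drop-off argument can be made concrete with a toy calculation. The sketch below assumes, purely for illustration, that transcription starts at the end of the cassette where the newest spacer sits and that the polymerase has a fixed, independent probability of falling off while transcribing each repeat-spacer unit. The function name and the drop-off value are hypothetical and are not taken from any measurement.

```python
def spacer_expression(n_spacers: int, dropoff_per_spacer: float) -> list[float]:
    """Probability that the polymerase transcribes through each spacer.

    Index 0 is the newest spacer (closest to the transcription start);
    the last index is the oldest spacer. Assumes a constant, independent
    drop-off probability per repeat-spacer unit (an illustrative assumption).
    """
    if not 0.0 <= dropoff_per_spacer <= 1.0:
        raise ValueError("dropoff_per_spacer must lie in [0, 1]")
    survive = 1.0 - dropoff_per_spacer
    # Spacer k is expressed only if the polymerase survives k + 1 units.
    return [survive ** (k + 1) for k in range(n_spacers)]


if __name__ == "__main__":
    # Hypothetical cassette of 10 spacers with a 5% drop-off per unit.
    for position, prob in enumerate(spacer_expression(10, 0.05), start=1):
        print(f"spacer {position:2d} (1 = newest): P(expressed) = {prob:.3f}")
```

Under these assumptions the expression probability decays geometrically with distance from the transcription start, so the most recently acquired spacers are always the most likely to be expressed.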

Unlike the CRISPR case, there is no analogous 
linear organization for piRNAs. Current evidence 
suggests that the transposons either jump randomly 
into piRNA loci or perhaps have a mild preference 
for inserting into the piRNA loci [86]. Selection for 
relevant piRNA insertions then occurs at the level of 
individual germ cells, and in this way adaptation to 
the invasion of the new transposable element can 
occur over the lifetime of an individual [86]. 
Nonetheless, because the mechanism for new 
piRNA insertion is close to random, it appears to 
be less efficient than the mechanism for new 
CRISPR insertion. Thus, although the population 
genetic arguments above suggest that piRNAs may 
be more effective than CRISPRs at repressing their 
targets, this may be countered by their less efficient 
acquisition mechanism. 
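
A rough sense of how inefficient near-random acquisition might be comes from treating each transposition as an independent trial that lands in a piRNA locus with a fixed probability. The sketch below is a minimal geometric waiting-time model under that assumption. The cluster fraction and the preference factor are hypothetical parameters chosen for illustration, not values reported in the literature cited here.

```python
def capture_probability_per_jump(cluster_fraction: float, preference: float = 1.0) -> float:
    """Chance that one transposition lands in a piRNA locus.

    cluster_fraction: assumed fraction of the genome occupied by piRNA loci.
    preference: assumed fold-enrichment for insertion into piRNA loci
                (1.0 means purely random insertion).
    """
    p = min(1.0, cluster_fraction * preference)
    if p <= 0.0:
        raise ValueError("capture probability must be positive")
    return p


def expected_jumps_until_capture(cluster_fraction: float, preference: float = 1.0) -> float:
    """Mean number of transpositions before the first piRNA-locus insertion."""
    return 1.0 / capture_probability_per_jump(cluster_fraction, preference)


def probability_captured(cluster_fraction: float, n_jumps: int, preference: float = 1.0) -> float:
    """Probability of at least one piRNA-locus insertion within n_jumps."""
    p = capture_probability_per_jump(cluster_fraction, preference)
    return 1.0 - (1.0 - p) ** n_jumps


if __name__ == "__main__":
    # Hypothetical: piRNA loci cover 5% of the genome.
    print(expected_jumps_until_capture(0.05))        # purely random insertion
    print(expected_jumps_until_capture(0.05, 2.0))   # mild two-fold preference
    print(probability_captured(0.05, 10))
```

In this toy model a mild insertion preference shortens the expected waiting time proportionally, whereas an active acquisition mechanism such as the CAS machinery would correspond to a capture probability close to one per encounter.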

Significance of the CRISPR/piRNA 
targeting mechanism 

Independent of their role as transposon repressors, 
piRNAs appear to have a role in the control of 
endogenous gene expression. Such roles include 
the control of mRNA translation, direction of both 
euchromatic and heterochromatic histone modifica- 
tions and control of higher order chromatin struc- 
tures [87]. These nontransposon related roles are 
expected to apply different evolutionary pressure to 
some piRNAs, perhaps more similar to the evolu- 
tionary properties of microRNAs [88]. It is not clear 
yet whether CRISPRs regulate host gene expression, 
but it is certainly conceivable that they have 
been co-opted by the cell for this purpose. 

The mechanism of crRNA targeting is a matter of 
debate in the current literature. Initial experiments 
showed that even one mismatch was enough to pre- 
vent CRISPR-mediated silencing [56]. Since this 
result cannot be explained thermodynamically, one 
possibility is that there is another system that senses 
the mismatch and prevents silencing. However, later 
studies suggested a less stringent requirement for base 
pairing [89, 90], and more recently evidence for a 
7-nt seed sequence in E. coli CRISPRs [91] was pre- 
sented, reminiscent of microRNA seeds [92]. The 
existence of a seed sequence would be highly signifi- 
cant for the evolution of CRISPRs since it would 
drastically reduce the amount of sequence in the 
CRISPR spacer under selective constraint and 
allow for rapid evolution of new targets. In contrast, 
there is currently no evidence for a seed sequence in 
piRNAs and current evidence points to a require- 
ment for nearly complete complementarity over the 
full length of the piRNA for targeting. Another 
interesting feature of piRNAs is that there are 
many redundant piRNA sequences, whereas the 
same does not appear to be true for CRISPRs. 
Redundancy would also serve to reduce the evolu- 
tionary constraint on individual piRNA sequences. 



CONCLUSIONS 

In this review, we have compared the evolution of 
two recently discovered RNA-based adaptive de- 
fense mechanisms: the piRNA system in animals 
and the CRISPR system in prokaryotes. In the pro- 
cess, we have reviewed the aspects of piRNA and 
CRISPR biology that are most relevant for under- 
standing and modeling their evolution. Overall, the 
evolutionary logic of the two systems is strikingly 
similar despite their lack of homology, perhaps 
pointing to the fundamental importance of RNA- 
based mechanisms in genome defense. However, as 
discussed in this review, many aspects of their 
molecular biology confer different evolutionary 
properties to the two systems. Several of the most 
basic evolutionary properties that still remain to be 
elucidated are: (i) the rate of evolution of piRNAs 
and CRISPRs at the sequence level; (ii) the rate of 
evolution of piRNA and CRISPR-generating loci at 
the level of copy number variation; and (iii) the true 
amount of sequence in each piRNA and CRISPR 
that is under selective constraint — particularly the 
question of whether there is a seed sequence or 
not. Beyond these basic questions of molecular evo- 
lution are broader evolutionary questions such as the 
interplay of the piRNA and CRISPR systems with 
alternative defense mechanisms against foreign nu- 
cleic acids, such as DNA methylation in the case of 
piRNAs, or restriction-modification systems and 
envelope resistance in the case of CRISPRs. Once 
we can compare the different defense mechanisms, 
we can study the conditions under which the 
piRNA or CRISPR system might play important 
roles in evolution. For example, it has been suggested 
that the restriction-modification system is 
important for colonization of new habitats but not 
in stable communities [82]. 

There is still a long way to go in understanding the 
basic molecular biology of piRNAs and CRISPRs, 
and detailed quantitative models of their evolution 
are not easy to formulate at the present time. 
Nonetheless, we hope that by highlighting a 
number of conceptual evolutionary issues, we can 
help frame future experimental and computational 
studies of these important genetic mechanisms. 